Incorporating Pronunciation Variation into Different Strategies of Term Transliteration

Authors

  • Jin-Shea Kuo
  • Ying-Kuei Yang
Abstract

Term transliteration addresses the problem of converting terms in one language into their phonetic equivalents in another language via spoken form. It is especially concerned with proper nouns, such as personal names, place names, and organization names. Pronunciation variation refers to the pronunciation ambiguity frequently encountered in spoken language, which has a serious impact on term transliteration. More than one transliteration variant can be generated for an out-of-vocabulary term because of different kinds of pronunciation variation. It is therefore important to take this issue into account when dealing with term transliteration. This paper proposes several models for term transliteration that take pronunciation variation into consideration. They describe transliteration from various viewpoints and utilize the relationships learned from extracted transliterated-term pairs. An experiment applying the proposed models to term transliteration was conducted and evaluated. The experimental results show promise. The proposed models are not only applicable to term transliteration but are also helpful in indexing and retrieving spoken documents.

Similar resources

Incorporating Pronunciation Variation into Extraction of Transliterated-term Pairs from Web Corpora

A novel approach to automatically extracting transliterated-term pairs from Web corpora is proposed in this paper. One of the most important issues addressed is that of taking pronunciation variation into account. Pronunciation variation is a phenomenon of pronunciation ambiguity that seriously affects the term transliteration and hence affects those results produced by transliteration processe...

Generating Paired Transliterated-cognates Using Multiple Pronunciation Characteristics from Web corpora

A novel approach to automatically extracting paired transliterated-cognates from Web corpora is proposed in this paper. One of the most important issues addressed is that of taking multiple pronunciation characteristics into account. Terms from various languages may be pronounced very differently. Incorporating knowledge of word origin may improve the pronunciation accuracy of terms. The accura...

Effects of Related Term Extraction in Transliteration into Chinese

To transliterate foreign technical terms and proper nouns, in Japanese and Korean, phonograms, such as Katakana and Hangul, are used. In Chinese, the pronunciation of a source word is spelled out with Kanji characters. However, because Kanji comprises ideograms, different Kanji are associated with the same pronunciation, but can potentially convey different meanings and impressions. In this pap...

Modeling Impression in Probabilistic Transliteration into Chinese

For transliterating foreign words into Chinese, the pronunciation of a source word is spelled out with Kanji characters. Because Kanji comprises ideograms, an individual pronunciation may be represented by more than one character. However, because different Kanji characters convey different meanings and impressions, characters must be selected carefully. In this paper, we propose a transliterat...

Morphological Cross Reference method for English to Telugu Transliteration

Machine transliteration is a subfield of computational linguistics concerned with automatically converting letters in one language to another language, using grapheme-based or phoneme-based transliteration approaches. Several methods for machine transliteration have been proposed to date, depending on the nature of the languages considered, but those methods have lower precision for English to Telugu tran...

Publication date: 2004